TclBLASR: an automatic speech recognition extension for Tcl
Abstract
We present TclBLASR, a framework that integrates a proprietary speech recognition engine, an open source scripting language such as Tcl/Tk, and an open source sound analysis toolkit such as Snack from KTH into a user-friendly platform on which a user can quickly write a Tcl/Tk script application for speech recognition evaluation, speech data collection and automatic annotation, and speech technology demonstration. The framework is especially useful for third-party customer evaluation of speech technologies, because it requires neither heavy C/C++ program development nor extensive knowledge of low-level speech engine APIs. Using the Bell Labs Automatic Speech Recognition (BLASR) engine, coupled with the real-time audio I/O and visualization provided by Snack and the flexible graphical user interface tools embedded in Tcl/Tk, the TclBLASR platform proves to be a useful framework for quickly packaging ASR engines for customer evaluation, without extensive customization of interfaces to meet the differing needs of a wide range of customers.
Related resources
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Designing and implementing a system for Automatic recognition of Persian letters by Lip-reading using image processing methods
For many years, speech has been the most natural and efficient means of information exchange for human beings. With the advancement of technology and the prevalence of computer usage, researchers have turned their attention to the design and production of speech recognition systems. Among these, lip-reading techniques face many challenges in speech recognition, and one of the challenges b...
SAPPHIRE: an extensible speech analysis and recognition tool based on Tcl/Tk
The SAPPHIRE system is a powerful, extensible, object-oriented toolkit allowing researchers to rapidly build and configure customized speech analysis tools. Implemented in Tcl/Tk and C, the current version of SAPPHIRE provides a wide range of functionality, including the ability to configure and run the SUMMIT speech recognition system. We now use SAPPHIRE widely in almost all aspects of our sp...
A language for creating speech applications
This paper describes an embedded Voice Interface Language (VIL) that enables the rapid prototyping and creation of applications requiring a voice interface. It can be integrated into popular script languages such as Perl or Tcl/Tk. Three levels of single-word commands enable the application designers to access basic speech processing technologies, such as automatic speech recognition and text-t...